AITopics | spurious minima

Related mixed matrix-tensor models have also been studied in the context of text-analysis applications in [20, 19].

artificial intelligence, machine learning, minima, (19 more...)

Neural Information Processing Systems

Country:

Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.05)
Europe > France (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II

Neural Information Processing Systems, Feb-9-2026, 13:35:41 GMT

We study the optimization problem associated with fitting two-layer ReLU neural networks with respect to the squared loss, where labels are generated by a target network. We make use of the rich symmetry structure to develop a novel set of tools for studying families of spurious minima. In contrast to existing approaches which operate in limiting regimes, our technique directly addresses the nonconvex loss landscape for a finite number of inputs d and neurons k, and provides analytic, rather than heuristic, information.
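
The setup in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the listing: standard Gaussian inputs, unit second-layer weights, an identity target network with k = d, and the hypothetical helper names `network` and `empirical_loss`. The sketch estimates the squared loss by Monte Carlo and checks one elementary symmetry, invariance under permuting the student's neurons.

```python
import random

def relu(z):
    return z if z > 0.0 else 0.0

def network(weights, x):
    """Two-layer ReLU network with unit output weights: sum_i relu(<w_i, x>)."""
    return sum(relu(sum(wj * xj for wj, xj in zip(w, x))) for w in weights)

def empirical_loss(student, target, samples):
    """Monte Carlo estimate of (1/2) * E[(f_student(x) - f_target(x))^2]."""
    total = 0.0
    for x in samples:
        diff = network(student, x) - network(target, x)
        total += 0.5 * diff * diff
    return total / len(samples)

rng = random.Random(0)
d, k, n = 4, 4, 2000  # inputs, neurons, sample count (illustrative values)

# Assumption: inputs are standard Gaussian and the target weights are the identity.
samples = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
target = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(k)]
student = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]

loss_at_target = empirical_loss(target, target, samples)
loss_student = empirical_loss(student, target, samples)

# Reordering the student's neurons leaves the network function unchanged.
permuted = [student[i] for i in (2, 0, 3, 1)]
loss_permuted = empirical_loss(permuted, target, samples)
```

The loss vanishes when the student equals the target, and the permuted student attains the same loss up to floating-point rounding. This shows only the simplest part of the symmetry structure the abstract refers to.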

artificial intelligence, machine learning, minima, (16 more...)

Neural Information Processing Systems

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(4 more...)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

3a61ed715ee66c48bacf237fa7bb5289-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 03:25:14 GMT

co 2, eigenvalue, minima, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

3a61ed715ee66c48bacf237fa7bb5289-Paper.pdf

Neural Information Processing Systems, Feb-8-2026, 03:25:07 GMT

We consider the optimization problem associated with fitting two-layer ReLU networks with respect to the squared loss, where labels are generated by a target network.

artificial intelligence, machine learning, minima, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Europe > Sweden > Stockholm > Stockholm (0.05)
North America > United States > California (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

2172fde49301047270b2897085e4319d-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 19:04:04 GMT

eigenvalue, threshold state, transition, (12 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland (0.04)
North America > Canada (0.04)
Europe > Italy > Lazio > Rome (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)

Annihilation of Spurious Minima in Two-Layer ReLU Networks

Neural Information Processing Systems, Dec-25-2025, 17:25:51 GMT

We study the optimization problem associated with fitting two-layer ReLU neural networks with respect to the squared loss, where labels are generated by a target network. Use is made of the rich symmetry structure to develop a novel set of tools for studying the mechanism by which over-parameterization annihilates spurious minima. Sharp analytic estimates are obtained for the loss and the Hessian spectrum at different minima, and it is shown that adding neurons can turn symmetric spurious minima into saddles through a local mechanism that does not generate new spurious minima; minima of smaller symmetry require more neurons. Using Cauchy's interlacing theorem, we prove the existence of descent directions in certain subspaces arising from the symmetry structure of the loss function. This analytic approach uses techniques, new to the field, from algebraic geometry, representation theory and symmetry breaking, and confirms rigorously the effectiveness of over-parameterization in making the associated loss landscape accessible to gradient-based methods. For a fixed number of neurons and inputs, the spectral results remain true under symmetry-breaking perturbations of the target.
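
Cauchy's interlacing theorem, mentioned in this abstract, can be checked numerically. The sketch below is a generic illustration, not the paper's construction: a random symmetric matrix stands in for a Hessian, and a random subspace stands in for the symmetry-derived subspaces. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 4  # full dimension and subspace dimension (illustrative values)

# Random symmetric matrix used as a stand-in for a Hessian.
A = rng.standard_normal((n, n))
H = (A + A.T) / 2

# Orthonormal basis Q of a random m-dimensional subspace.
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
H_sub = Q.T @ H @ Q  # compression of H to that subspace

eig_full = np.linalg.eigvalsh(H)     # eigenvalues in ascending order
eig_sub = np.linalg.eigvalsh(H_sub)  # eigenvalues in ascending order

# Interlacing: eig_full[j] <= eig_sub[j] <= eig_full[j + n - m] for every j.
tol = 1e-10
interlaces = all(
    eig_full[j] - tol <= eig_sub[j] <= eig_full[j + n - m] + tol
    for j in range(m)
)
```

One consequence is that the smallest eigenvalue of the full matrix never exceeds the smallest eigenvalue of its compression. So a negative eigenvalue found on a subspace already certifies a descent direction for the full matrix, which is one way a result of this kind can be used.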

annihilation, name change, spurious minima, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.78)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)

Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II

Neural Information Processing Systems, Dec-24-2025, 09:03:25 GMT

We study the optimization problem associated with fitting two-layer ReLU neural networks with respect to the squared loss, where labels are generated by a target network. We make use of the rich symmetry structure to develop a novel set of tools for studying families of spurious minima. In contrast to existing approaches which operate in limiting regimes, our technique directly addresses the nonconvex loss landscape for a finite number of inputs $d$ and neurons $k$, and provides analytic, rather than heuristic, information. In particular, we derive analytic estimates for the loss at different minima, and prove that, modulo $O(d^{-1/2})$-terms, the Hessian spectrum concentrates near small positive constants, with the exception of $\Theta(d)$ eigenvalues which grow linearly with $d$. We further show that the Hessian spectra at global and spurious minima coincide to $O(d^{-1/2})$-order, thus challenging our ability to argue about statistical generalization through local curvature. Lastly, our technique provides the exact fractional dimensionality at which families of critical points turn from saddles into spurious minima. This makes possible the study of the creation and the annihilation of spurious minima using powerful tools from equivariant bifurcation theory.

analytic study, spurious minima, two-layer relu neural network, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Analytic Characterization of the Hessian in Shallow ReLU Models: A Tale of Symmetry

Neural Information Processing Systems, Oct-2-2025, 17:08:36 GMT

Much of the current effort in understanding the empirical success of artificial neural networks is concerned with the geometry of the associated nonconvex optimization landscapes. Of particular importance is the Hessian spectrum which characterizes the local curvature of the loss at different points in the space.

artificial intelligence, co 2, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.27)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
